Abstract

An ancestry labeling scheme labels the nodes of any tree in such a way that ancestry
queries between any two nodes in a tree can be answered just by looking at their corresponding
labels. The common measure to evaluate the quality of an ancestry labeling scheme is
its label size, that is, the maximal number of bits stored in a label, taken over all $n$-node
trees. The design of ancestry labeling schemes finds applications in XML search engines. In
the context of these applications, even small improvements in the label size are important.
In fact, the literature about this topic is interested in the exact label size rather than just its
order of magnitude. As a result, following the proposal of an original scheme of size $2\log n$
bits, a considerable amount of work was devoted to improving the bound on the label size.
The current state of the art upper bound is $\log n + O(\sqrt{\log n})$ bits, which is still far from
the known $\log n + \Omega(\log\log n)$ lower bound. Moreover, the hidden constant factor in the
additive $O(\sqrt{\log n})$ term is large, which makes this term dominate the label size for typical
current XML trees.

In an attempt to provide good performance for real XML data, we rely on the observation
that the depth of a typical XML tree is bounded from above by a small constant. Having
this in mind, we present an ancestry labeling scheme of size $\log n + 2\log d + O(1)$, for the
family of trees with at most $n$ nodes and depth at most $d$. In addition to our main result,
we prove a result that may be of independent interest concerning the existence of a linear
universal graph for the family of forests with trees of bounded depth.



*This research is supported in part by the ANR project ALADDIN, by the INRIA project GANG, and by 
1 Introduction



1.1 Background

It is often the case that when people wish to retrieve data from the Internet, they use search
engines like Yahoo or Google which provide full-text indexing services (the user gives some
keywords and the engine returns documents containing these keywords). In contrast to such
search engines, the evolving XML Web-standard [2, 33] aims to allow more sophisticated
queries of documents. By describing the semantic structure of the document components, it
allows users to not only ask full-text queries (find documents containing the phrase "computer
science researchers") but also ask for more sophisticated data (find all items about computer
science researchers who did their PhD at ETH Zürich and whose age is below 35).

To implement such sophisticated queries, Web documents obeying the XML standard are 
viewed as labeled trees, and typical queries over the documents amount to testing relationships 
between document items, which correspond to ancestry queries among the corresponding tree 
nodes [21 [HI [Ml ES] . To process such queries, XML query engines often use an index structure, 
typically a big hash table, whose entries are the tag names in the indexed documents. Due 
to the enormous size of the Web data and to its distributed nature, it is essential to answer 
queries using the index labels only, without accessing the actual documents. To achieve good
performance, it is essential that a large portion of the index structure resides in the main
memory. Since we are dealing here with a huge number of index labels, reducing the
label size, even by a constant factor, is critical for the reduction of memory cost and for
performance improvement. For more details regarding XML search engines see, e.g., [Il[3l[7|. 

Labeling schemes which are currently being used by actual systems are variants of the
following interval-based ancestry labeling scheme [16, 30]. Enumerate the leaves from left to
right and label each node $u$ by the interval $[\ell_s, \ell_l]$, where $\ell_s$ (respectively, $\ell_l$) is the smallest (resp.,
largest) leaf descendant of $u$. An ancestry query then amounts to an interval containment query
between the corresponding interval labels. It is easy to see that the size of the labels produced
by this simple scheme is bounded by $2\log n$ bits, where $n$ is the size of the tree.
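
As a rough illustration of the interval scheme just described, the following Python sketch labels a small tree and answers ancestry queries by interval containment. It is a minimal sketch under stated assumptions, not the implementation used by any actual system: the names `interval_labels` and `is_ancestor` and the dictionary-based tree encoding are hypothetical, and the sketch numbers all nodes in DFS pre-order (a common variant) rather than only the leaves, so that a node with a single child still receives a different interval from that child.

```python
def interval_labels(children, root):
    """Label each node with (lo, hi): its pre-order number and the largest
    pre-order number found in its subtree (hypothetical helper)."""
    labels = {}
    counter = 0
    stack = [(root, False)]          # iterative DFS: (node, subtree finished?)
    while stack:
        node, done = stack.pop()
        if done:
            # Every descendant is numbered by now; counter - 1 is the largest one.
            labels[node] = (labels[node], counter - 1)
        else:
            labels[node] = counter   # temporarily store the pre-order number
            counter += 1
            stack.append((node, True))
            stack.extend((c, False) for c in reversed(children.get(node, [])))
    return labels


def is_ancestor(label_u, label_v):
    """Decoder: u is a proper ancestor of v iff u's interval contains v's."""
    lo_u, hi_u = label_u
    lo_v, hi_v = label_v
    return label_u != label_v and lo_u <= lo_v and hi_v <= hi_u


# Example tree: r has children a and b; a has a single child c.
tree = {"r": ["a", "b"], "a": ["c"]}
labels = interval_labels(tree, "r")
```

Each label consists of two numbers smaller than $n$, which matches the $2\log n$ bound stated above; the decoder never looks at the tree itself.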

A considerable amount of research was devoted to improving the upper bound on the label size
as much as possible [U [3l [32] . The current state of the art upper bound [1] is $\log n + O(\sqrt{\log n})$,
which is still far from the known $\log n + \Omega(\log\log n)$ lower bound 0]. Moreover, the hidden
constant factor in the additive $O(\sqrt{\log n})$ term is large, which makes this term dominate the
label size for trees of the size found in current applications. Following that work, [T3] suggested other
ancestry labeling schemes whose worst case bound is $1.5\log n + O(1)$ but which perform better than
the scheme of [1] for typical XML data.

In an attempt to provide good performance for real XML instances, we rely on the observation
that a typical XML tree has extremely small depth (cf. [7, 26, 25]). For example, by examining
about 200,000 XML documents on the Web, Mignet et al. [25] found that the average depth of an
XML tree is 4, and that 99% of the trees have depth at most 8. Motivated by this observation,
we concentrate on bounded depth trees, and prove an upper bound of $\log n + 2\log d + O(1)$ for
the size of an ancestry labeling scheme for the family of $n$-node trees whose depth is bounded
by $d$. (In fact, our bound holds even for forests rather than just for trees.)

It is not clear whether one can adapt the techniques from previous schemes to perform better
on trees of small depth. For example, the simple interval scheme has label size $2\log n$ even for
trees with constant depth. As another example, before starting the actual labeling process,
the ancestry scheme in [T] first transforms the given tree to a binary tree. This transformation
already results in a tree of depth $\Omega(\log n)$, even if the given tree has constant depth. Moreover,
previous relevant schemes extensively use and rely on a specific technique, namely the use of alphabetic
codes on different subpaths. This technique, at least on its surface, does not seem to be more
effective on short subpaths than on long ones.

In contrast, this paper uses a different technique that does not rely on alphabetic codes. 
Informally, the idea behind our scheme is the following. The labels of the nodes are taken 
from a small set of integers U, thus ensuring short labels. Each integer in U is associated with 
some interval taken from some limited range. The fundamental rule of our labeling scheme 
is that a node u is an ancestor of v if and only if the interval associated with the label of u 
(i.e., the corresponding integer in U) contains the interval associated with the label of v. That 
way, the ancestry query can be answered very easily, simply by comparing the corresponding 
intervals. The main technical challenge is to find a way to define these intervals and nest them within
one another so that the nodes of any $n$-node forest of bounded depth can be appropriately mapped
into $U$, while keeping $U$ small.

1.2 Other related work 

Implicit labeling schemes were first introduced in [16], where an elegant adjacency labeling
scheme of size $2\log n$ is established on $n$-node trees. That paper also notices a relation between
adjacency labeling schemes and universal graphs (see also [HI fTUl 124)). Precisely, it is shown
that there exists an adjacency labeling scheme with label size $k$ for a graph family $\mathcal{G}$ if and only
if there exists a universal graph for $\mathcal{G}$ with $2^k$ nodes.

Adjacency labeling schemes on trees were further investigated in an attempt to reduce
the constant factor in the label size. In [13] an adjacency labeling scheme using label size of
$\log n + O(\sqrt{\log n})$ is presented; and in [S] the label size was further reduced to $\log n + O(\log^* n)$.
This current state of the art bound implies the existence of a universal graph for the family of
$n$-node trees with $2^{O(\log^* n)} n$ nodes.

Labeling schemes were also proposed for other decision problems on graphs, including
distance [ilinKnillSKnilMlETlEI], routing P ESJ, flow [211 in], vertex connectivity ^19^ tlTj,
nearest common ancestor [SISH], and various other tree functions, such as center, separation
level, and Steiner weight of a given subset of vertices [28]. See [H] for a survey on static
labeling schemes. Dynamic labeling schemes were investigated in a number of papers, e.g.,
[H [201 [231 [22].

1.3 Our contributions 

We present an ancestry labeling scheme of size $\log n + 2\log d + O(1)$ for the family of rooted
forests with at most $n$ nodes and depth at most $d$. Our result is essentially optimal for rooted
trees with constant depth, and thus for the typical XML trees.

As a corollary of our main theorem, we get an adjacency scheme of size $\log n + 3\log d + O(1)$
for the family of forests with at most $n$ nodes and depth bounded by $d$. This, in particular,
implies the existence of a linear universal graph for the family of forests with constant depth.
Namely, we show the existence of a graph of size $O(n)$ that contains all $n$-node forests of constant
depth as vertex induced subgraphs.
2 Preliminaries



Let $T$ be a tree rooted at some node $r$ referred to as the root of $T$. The depth of a node $u \in V(T)$
is defined as 1 plus the hop distance from $u$ to the root of $T$. In particular, the depth of the
root is 1. The depth of $T$ is the maximum depth of a node in $T$. Let $u$ and $v$ be two nodes in
$T$. We say that $u$ is an ancestor of $v$ if $u \neq v$ and $u$ is one of the nodes on the shortest path
connecting $v$ and the root of $T$.

A rooted forest $F$ is a collection of rooted trees. The depth of $F$ is the maximum depth of
a tree in $F$. For two nodes $u$ and $v$ in $F$, we say that $u$ is an ancestor of $v$ if and only if $u$ is an
ancestor of $v$ in one of the trees in $F$. For integers $n$ and $d$, let $\mathcal{F}(n, d)$ denote the family of all
rooted forests with at most $n$ nodes and depth bounded from above by $d$.

An ancestry labeling scheme $(\mathcal{M}, \mathcal{D})$ for a family $\mathcal{F}$ of rooted forests is composed of the
following components:

1. A marker algorithm $\mathcal{M}$ that, given a forest $F$ in $\mathcal{F}$, assigns labels to its nodes.

2. A polynomial time decoder algorithm $\mathcal{D}$ that, given two labels $\ell_1$ and $\ell_2$ in the output
domain of $\mathcal{M}$, returns a boolean in $\{0, 1\}$.

These components must satisfy that if $L(u)$ and $L(v)$ denote the labels assigned by the marker
to two nodes $u$ and $v$ in some rooted forest $F \in \mathcal{F}$, then

$$\mathcal{D}(L(u), L(v)) = 1 \iff u \text{ is an ancestor of } v \text{ in } F.$$

It is important to note that the decoder $\mathcal{D}$ is independent of the forest $F$. Thus $\mathcal{D}$ can be viewed
as a method for computing ancestry values in a "distributed" fashion, given any pair of labels
and knowing that the forest belongs to some specific family $\mathcal{F}$.

The common complexity measure used to evaluate a labeling scheme $(\mathcal{M}, \mathcal{D})$ is the label
size, that is the maximum number of bits in a label assigned by the marker algorithm $\mathcal{M}$ to
any node in any forest in $\mathcal{F}$.

Given two integers $a$ and $b$, where $a \le b$, let $[a, b]$ (respectively, $[a, b)$) denote the interval
containing the integers $i$ such that $a \le i \le b$ (resp., $a \le i < b$). Given a graph $G$, let $|G|$ denote
the number of nodes in $G$.

3 A compact ancestry labeling scheme for $\mathcal{F}(n, d)$

This section is devoted to proving the existence of an ancestry labeling scheme of size
$\log n + 2\log d + O(1)$ for the family of rooted forests in $\mathcal{F}(n, d)$. Informally, the scheme performs as
follows. We construct a set of intervals $U$ such that the nodes of any forest in $\mathcal{F}(n, d)$ can
be mapped to $U$, in a way that the ancestry relation can be answered using a simple interval
containment test. That is, we make sure that $u$ is an ancestor of $v$ in some forest $F$ if and only if
the interval associated with $u$ contains the interval associated with $v$. We call such a mapping
an ancestry mapping. A label of a node in $F$ is simply a pointer to an element in $U$, and thus
can be encoded using $\log |U|$ bits. Therefore, to get short labels we need $U$ to be small.

The construction of $U$ is done by induction on the number of nodes in the forest. Assume
that there exists some set of intervals $U_k$, such that for any forest $F$ of size at most $2^k$, there exists
an ancestry mapping from $F$ to $U_k$, and consider now the set of forests $\mathcal{F}_{k+1}$ with at most $2^{k+1}$
nodes. Of course, if every $F \in \mathcal{F}_{k+1}$ would break nicely into two forests with at most $2^k$ nodes
each, then one could embed the two parts separately on two interval sets $U'$ and $U''$ of the
same size as $U_k$. If that was always the case, we would ultimately get an interval set $U = U_{\log n}$
of linear size for which any forest of size at most $n$ could be embedded to $U$ via an ancestry
mapping, and that would yield an ancestry labeling scheme with label size $\log n$.

Unfortunately, life is not so simple, and a forest $F \in \mathcal{F}_{k+1}$ doesn't always break nicely into two
equal size sub-forests. Specifically, problems occur whenever one must break a tree $T$ of $F$ into
two parts and embed one part in $U'$ and the other part in $U''$. Ideally, if $F$ is broken into
$F' \cup F'' \cup T$ where $F'$ is embedded in $I' \subseteq U'$, and $F''$ is embedded in $I'' \subseteq U''$, then one wants
to embed $T$ by borrowing what remains free in $U' \setminus I'$ and $U'' \setminus I''$. This can be achieved by
using various scales of sub-interval sizes, so as to embed $T$ in $J = J' \cup J''$ with $J' \subseteq U' \setminus I'$
and $J'' \subseteq U'' \setminus I''$.

Two difficulties arise in this recursive approach. The first one is related to the scale of the
sub-intervals in which one picks $J'$ and $J''$. Indeed, too many sub-intervals yields too many
intervals in $U_{k+1}$. On the other hand, too few sub-intervals yields too large gaps between $I'$ and
$J'$ in $U'$. This prevents the intervals in $U_{k+1}$ from being sufficiently compressed, and thus also
ultimately results in too many intervals in $U_{k+1}$. Determining a good tradeoff between the
amount of scaling in the sub-intervals, and the gaps between intervals, is thus one major issue.

The second difficulty that is faced by the recursive approach is that splitting a tree into
subtrees of sizes at most half is performed by removing the separator of the tree. However, one
can see that whenever a tree $T$ of $2^{k+1}$ nodes is split into a collection $T_1, T_2, \ldots$ of subtrees by
removing the separator of $T$, the subtree containing the root of $T$ plays a special role in the
setting of the ancestry scheme. Dealing with this special subtree is a second important issue,
for which the assumption on the depth of the forests will play a major role. The proof of the
theorem below shows how to overcome these two issues.

Theorem 3.1 There exists an ancestry labeling scheme for the family of rooted forests in
$\mathcal{F}(n, d)$ whose label size is $\log n + 2\log d + O(1)$.

Proof. For simplicity, we assume $n$ is a power of 2. (If $n$ is not a power of 2, we just round it
up to the next power of 2, say $N$, and we add $N - n$ independent nodes to the forest.) We begin
by defining a set $U = U(n, d)$ of integers, which we use later to label all forests in $\mathcal{F}(n, d)$.

Let $c_0 = 1$, and, for any $i$, $1 \le i \le \log n$, let

$$c_i = c_{i-1} + \frac{1}{i^2} = 1 + \sum_{j=1}^{i} \frac{1}{j^2}.$$

We have $1 + \sum_{j \ge 1} 1/j^2 < 3$, and hence all the $c_i$'s are bounded from above by 3. For any $i$,
$1 \le i \le \log n$, let us define the following values, that will be used to decompose integers:

$$H_i = 1 + \frac{3 \cdot n \cdot d \cdot i^2}{2^{i-1}}, \qquad J_i = 2 \cdot d \cdot c_i \cdot i^2.$$

Then we define $\Gamma_0 = 3n$, and

$$\Gamma_i = \Gamma_0 + \sum_{j=1}^{i} H_j \cdot J_j.$$

The set of integers $U$ is defined as the interval

$$U = [1, \Gamma_{\log n}).$$
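
The size of $U$ is governed by $\Gamma_{\log n}$. As a sketch of why this quantity is $O(nd^2)$, assuming the definitions $H_j = 1 + 3nd\,j^2/2^{j-1}$ and $J_j = 2 d c_j j^2$ and using only the bound $c_j < 3$:

```latex
\begin{align*}
\Gamma_{\log n}
  &= 3n + \sum_{j=1}^{\log n} \Bigl(1 + \frac{3nd\,j^2}{2^{j-1}}\Bigr)\, 2d\,c_j\,j^2 \\
  &\le 3n + 6d \sum_{j=1}^{\log n} j^2 + 36\,nd^2 \sum_{j \ge 1} \frac{j^4}{2^{j}} \\
  &= O(n) + O(d \log^3 n) + O(nd^2) \;=\; O(nd^2).
\end{align*}
```

The last line uses the convergence of the series $\sum_{j \ge 1} j^4/2^j$ and the fact that $\log^3 n = O(n)$.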
Note that since $\Gamma_{\log n} = O(nd^2)$, we have $|U| = O(nd^2)$. The marker algorithm maps the nodes
of any forest $F \in \mathcal{F}(n, d)$ into the integer set $U$. To answer queries, the decoder algorithm represents
each integer in $U$ as a unique triplet $(i, h, j)$, as follows.

• An integer $\nu \in [1, \Gamma_0)$ is simply represented by $(0, \nu, 0)$;

• An integer $\nu$ that satisfies $\Gamma_{i-1} \le \nu < \Gamma_i$ for some $1 \le i \le \log n$ can be described as

$$\nu = \Gamma_{i-1} + h J_i + j$$

for unique $h$ and $j$ such that $h \in [0, H_i)$ and $j \in [0, J_i)$; hence we represent such $\nu$ by the
triplet $(i, h, j)$.
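
The decomposition of integers into triplets can be sketched in Python as follows. This is an illustrative sketch under assumptions rather than the authors' procedure: the function names `parameters`, `to_triplet` and `from_triplet` are hypothetical, $n$ is taken to be a power of 2, and $H_i$ and $J_i$ are rounded up to integers (the text does not state how fractional values are handled). Exact rational arithmetic is used for the constants $c_i$.

```python
from fractions import Fraction
from math import ceil


def parameters(n, d):
    """Return (H, J, gamma) for levels i = 0..log2(n); n must be a power of 2."""
    levels = n.bit_length() - 1
    if n < 1 or 1 << levels != n:
        raise ValueError("n must be a power of 2")
    c = Fraction(1)                      # c_0 = 1
    H, J, gamma = [0], [0], [3 * n]      # index 0 of H and J is unused; gamma[0] = 3n
    for i in range(1, levels + 1):
        c += Fraction(1, i * i)          # c_i = c_{i-1} + 1/i^2
        # Assumption: round H_i and J_i up so that both are integers.
        H.append(ceil(1 + Fraction(3 * n * d * i * i, 2 ** (i - 1))))
        J.append(ceil(2 * d * c * i * i))
        gamma.append(gamma[-1] + H[i] * J[i])
    return H, J, gamma


def to_triplet(nu, J, gamma):
    """Decompose an integer nu in [1, gamma[-1]) into its triplet (i, h, j)."""
    if not 1 <= nu < gamma[-1]:
        raise ValueError("nu is outside U")
    if nu < gamma[0]:
        return (0, nu, 0)
    # First level i with gamma[i-1] <= nu < gamma[i].
    i = next(i for i in range(1, len(gamma)) if nu < gamma[i])
    h, j = divmod(nu - gamma[i - 1], J[i])
    return (i, h, j)


def from_triplet(triplet, J, gamma):
    """Inverse of to_triplet: rebuild the integer from (i, h, j)."""
    i, h, j = triplet
    return h if i == 0 else gamma[i - 1] + h * J[i] + j


H, J, gamma = parameters(n=64, d=4)
label_bits = (gamma[-1] - 1).bit_length()   # bits needed to store any integer in U
```

Because `divmod` yields $0 \le j < J_i$ and $\nu - \Gamma_{i-1} < H_i J_i$, the pair $(h, j)$ is unique and lies in the stated ranges.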

For simplicity of presentation, in the following, we will not distinguish between an integer
in $U$ and its triplet representation, unless it may cause confusion. Every integer in $U$ is
associated with an interval as follows. Let $x_0 = 1$, and for any $i$, $1 \le i \le \log n$, let

$$x_i = \left\lceil \frac{2^{i-1}}{d \cdot i^2} \right\rceil.$$

For $h \in [0, \Gamma_0)$, we associate the triplet $(0, h, 0) \in U$ with the interval $I_{0,h,0} = [h]$. For any $i$,
$1 \le i \le \log n$, any $h \in [0, H_i)$, and any $j \in [0, J_i)$, we associate the triplet $(i, h, j) \in U$ with the
interval

$$I_{i,h,j} = [x_i h,\ x_i (h + j)).$$
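
The association between triplets and intervals, and the containment test that a decoder would perform, can be sketched as follows. This is a hedged illustration only: the names `x`, `interval` and `is_ancestor` are hypothetical, the sketch works directly on triplets rather than on integer labels, and it assumes the scale $x_i = \lceil 2^{i-1}/(d \cdot i^2) \rceil$ with $x_0 = 1$ and the interval $[x_i h, x_i(h+j))$ for a triplet $(i, h, j)$ with $i \ge 1$.

```python
from fractions import Fraction
from math import ceil


def x(i, d):
    """Scale of level i (assumed formula): x_0 = 1, x_i = ceil(2^(i-1) / (d * i^2))."""
    return 1 if i == 0 else ceil(Fraction(2 ** (i - 1), d * i * i))


def interval(triplet, d):
    """Half-open interval (start, end) associated with a triplet (i, h, j)."""
    i, h, j = triplet
    if i == 0:
        return (h, h + 1)                 # the single point [h], written half-open
    return (x(i, d) * h, x(i, d) * (h + j))


def is_ancestor(t_u, t_v, d):
    """Containment test; equal labels are excluded so a node is not its own ancestor."""
    (a_u, b_u), (a_v, b_v) = interval(t_u, d), interval(t_v, d)
    return t_u != t_v and a_u <= a_v and b_v <= b_u


# With d = 2, level 10 has scale x = ceil(512 / 200) = 3.
spine_node = (10, 4, 5)                   # interval [12, 27)
inner_node = (10, 4, 2)                   # interval [12, 18)
leaf_inside = (0, 13, 0)                  # interval [13, 14)
leaf_outside = (0, 27, 0)                 # interval [27, 28)
```

The test only compares four integers, which is why the decoder does not need access to the forest.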

We now define a concept of specific interest for the purpose of our proof:

Definition 1 Let $F \in \mathcal{F}(n, d)$. We say that a mapping $L : F \to U$ is an ancestry mapping if,
for every two nodes $u, v \in F$ with $L(u) = (i, h, j)$ and $L(v) = (i', h', j')$, we have

$$u \text{ is an ancestor of } v \text{ in } F \iff I_{i',h',j'} \subseteq I_{i,h,j}.$$



In order to show that there exists an ancestry mapping from every forest in $\mathcal{F}(n, d)$ into $U$,
we shall make use of the following definitions. For any interval $I \subseteq [1, \Gamma_0)$, let

$$U_0(I) = \{(0, \nu, 0) \mid \nu \in I\}$$

and, for any $k$, $1 \le k \le \log n$, let

$$U_k(I) = U_0(I) \cup \{(i, h, j) \mid 1 \le i \le k,\ h \in [0, H_i),\ j \in [0, J_i) \text{ and } I_{i,h,j} \subseteq I\}.$$

The following observations are immediate by the definition of the sets $U_k(I)$. Let $I$ and $J$
be two intervals in $[1, \Gamma_0)$. For any $k$, $1 \le k \le \log n$, we have:

• $I \cap J = \emptyset \implies U_k(I) \cap U_k(J) = \emptyset$,

• $U_k(I) \cup U_k(J) \subseteq U_k(I \cup J)$,

• $I \subseteq J \implies U_k(I) \subseteq U_k(J)$,

• $U_{k-1}(I) \subseteq U_k(I)$.
Fix $k$ such that $0 \le k \le \log n$. We now give a sufficient condition for the existence of an
ancestry labeling scheme using labels in $U_k(I)$. Let $I$ be an interval in $[1, \Gamma_0)$ and let $I_1, I_2, \ldots, I_t$
be a partition of $I$ into $t$ disjoint intervals, i.e., $I = \bigcup_{i=1}^{t} I_i$ with $I_i \cap I_j = \emptyset$ for any $1 \le i < j \le t$.
Let $F$ be a forest, and let $F_1, F_2, \ldots, F_t$ be $t$ pairwise disjoint forests such that $\bigcup_{i=1}^{t} F_i = F$.
Using the four properties listed above, one can easily prove the following.

Claim 3.2 If there exists an ancestry mapping from $F_i$ to $U_k(I_i)$ for every $i$, $1 \le i \le t$, then
there exists an ancestry mapping from $F$ to $U_k(I)$.

The following is the main technical ingredient for proving the theorem. 

Claim 3.3 For every $k$, $0 \le k \le \log n$, every forest $F$ of size $|F| \le 2^k$ with depth bounded by
$d$, and every interval $I \subseteq [1, \Gamma_0)$, such that $|I| = \lfloor c_k|F| \rfloor$, there exists an ancestry mapping of
$F$ into $U_k(I)$.

We prove this claim by induction on $k$. The claim for $k = 0$ holds trivially. Assume now
that the claim holds for $k$ with $0 \le k < \log n$, and let us show that it also holds for $k + 1$.

Let $F$ be a forest of size $|F| \le 2^{k+1}$, and let $I \subseteq [1, \Gamma_0)$ be an interval, such that
$|I| = \lfloor c_{k+1}|F| \rfloor$. Our goal is to show that there exists an ancestry mapping of $F$ into $U_{k+1}(I)$. We
consider two cases.

The simpler case is when all the trees in $F$ are of size at most $2^k$. For this case, we show
a claim stronger than what is stated in Claim 3.3. Specifically, we show that there exists an
ancestry mapping of $F$ into $U_k(I)$ for every interval $I \subseteq [1, \Gamma_0)$ such that $|I| = \lfloor c_k|F| \rfloor$ (i.e., an
interval smaller by a fraction $1/(k+1)^2$ of $|F|$ than what is required to prove the claim). Let $T_1, T_2, \ldots, T_\ell$
be the trees in $F$. We divide the given interval $I$ of size $\lfloor c_k|F| \rfloor$ into $\ell + 1$ disjoint subintervals
$I = I_1 \cup I_2 \cup \cdots \cup I_\ell \cup I'$, where $|I_i| = \lfloor c_k|T_i| \rfloor$ for every $i$, $1 \le i \le \ell$. This can be done because
$\sum_{i=1}^{\ell} \lfloor c_k|T_i| \rfloor \le \lfloor c_k|F| \rfloor = |I|$. By the induction hypothesis, we have an ancestry mapping of $T_i$
into $U_k(I_i)$ for every $i$, $1 \le i \le \ell$. The stronger claim thus follows in this case by Claim 3.2.

Now consider the more involved case in which one of the trees in $F$, denoted by $T^*$,
contains more than $2^k$ nodes. Our goal now is to show that for every interval $I^* \subseteq [1, \Gamma_0)$,
where $|I^*| = \lfloor c_{k+1}|T^*| \rfloor$, there exists an ancestry mapping of $T^*$ into $U_{k+1}(I^*)$. Once we show
this, we can, similarly to the first case, divide the interval $I$ into 3 disjoint subintervals

$$I = I^* \cup I' \cup I'',$$

where

$$|I^*| = \lfloor c_{k+1}|T^*| \rfloor \quad \text{and} \quad |I'| = \lfloor c_k|F'| \rfloor$$

with $F' = F \setminus T^*$. Since we have an ancestry mapping that maps $T^*$ into $U_{k+1}(I^*)$, and one
that maps $F'$ into $U_k(I')$, we get the desired ancestry mapping of $F$ into $U_{k+1}(I)$ by Claim 3.2.
(The ancestry mapping of $F'$ into $U_k(I')$ can be done by the induction hypothesis, because
$|F'| < 2^k$.)

For the rest of the proof, our goal is thus to prove the following claim: for every tree $T$ of
size $|T|$ with $2^k < |T| \le 2^{k+1}$, and every interval $I \subseteq [1, \Gamma_0)$, where $|I| = \lfloor c_{k+1}|T| \rfloor$, there exists
an ancestry mapping of $T$ into $U_{k+1}(I)$.

Recall that a separator of a tree $T$ is a node $v$ whose removal from $T$ (together with all its
incident edges) breaks $T$ into subtrees, each of size at most $|T|/2$. It is a well known fact that
every tree has a separator. Note, however, that a tree can have more than one separator.
Nevertheless, if this is the case then there are in fact two separators, and one is the parent of
the other. In the following, whenever we consider a separator of a rooted tree $T$, we refer only
to the separator of $T$ which is closer to the root.
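
One possible way to find the separator that is closer to the root is to walk down from the root into any child whose subtree holds more than half of the nodes. The Python sketch below illustrates this idea; it is not taken from the paper, and the names `subtree_sizes` and `separator` as well as the dictionary encoding of the tree are hypothetical.

```python
def subtree_sizes(children, root):
    """Return a dict mapping every node to the size of its subtree."""
    order, stack = [], [root]
    while stack:                          # parents are visited before descendants
        v = stack.pop()
        order.append(v)
        stack.extend(children.get(v, []))
    size = {}
    for v in reversed(order):             # so children are sized before their parent
        size[v] = 1 + sum(size[c] for c in children.get(v, []))
    return size


def separator(children, root):
    """Return the separator of the rooted tree that is closest to the root."""
    size = subtree_sizes(children, root)
    n = size[root]
    v = root
    while True:
        # At most one child can hold strictly more than half of the nodes.
        heavy = next((c for c in children.get(v, []) if 2 * size[c] > n), None)
        if heavy is None:
            return v                      # all parts left after removing v have size <= n/2
        v = heavy


path = {"r": ["a"], "a": ["b"], "b": ["c"]}      # path r - a - b - c
star = {"r": ["x", "y", "z"]}                    # root with three leaves
```

On the four-node path, both `a` and `b` are separators and `a` is the parent of `b`; the walk stops at `a` because the subtree of `b` holds exactly half of the nodes, not more.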

We make use of the following decomposition of $T$. We refer to the path $S$ from the separator
of $T$ to the root of $T$ as the spine of $T$. This spine may consist of only one node, namely, the
root of $T$. Let $v_1, v_2, \ldots, v_{d'}$ be the nodes of the spine $S$, ordered bottom-up, i.e., $v_1$ is the
separator of $T$ and $v_{d'}$ is the root of $T$. By this definition, we have that if $1 \le i < j \le d'$ then
$v_j$ is an ancestor of $v_i$. A separator is not a leaf if $|T| > 1$, and therefore $1 \le d' < d$. (Recall
that the depth is 1 plus the distance to the root.) By removing the nodes in the spine (and
the edges connected to them), the tree $T$ breaks into $d'$ forests $F_1, F_2, \ldots, F_{d'}$, such that the
following holds for each $1 \le i \le d'$:

• in $T$, the roots of the trees in $F_i$ are connected to $v_i$,

• each tree in $F_i$ contains at most $2^k$ nodes.
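
The spine decomposition can be sketched as follows. This is a minimal illustration under assumptions: the function name `spine_decomposition` and the `parent`/`children` dictionaries are hypothetical, and the separator `sep` is assumed to have been computed beforehand and is passed in as an argument.

```python
def spine_decomposition(parent, children, sep):
    """Return (spine, forests).

    spine[i] is the spine node v_{i+1}, listed bottom-up from the separator to
    the root; forests[i] lists the roots of the trees of F_{i+1}, i.e. the
    children of v_{i+1} that are not themselves on the spine.
    """
    spine = [sep]
    while parent.get(spine[-1]) is not None:   # climb until the root is reached
        spine.append(parent[spine[-1]])
    on_spine = set(spine)
    forests = [
        [c for c in children.get(v, []) if c not in on_spine]
        for v in spine
    ]
    return spine, forests


# Example: r -> {a, x}, a -> {s, y}, s -> {p, q, t, w}; s is the separator
# (9 nodes in total, and the subtree of s holds 5 of them).
children = {"r": ["a", "x"], "a": ["s", "y"], "s": ["p", "q", "t", "w"]}
parent = {"r": None, "a": "r", "x": "r", "s": "a", "y": "a",
          "p": "s", "q": "s", "t": "s", "w": "s"}
spine, forests = spine_decomposition(parent, children, "s")
```

In this example the spine has $d' = 3$ nodes, and each forest $F_i$ hangs below exactly one spine node, as in the two properties listed above.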

The given interval $I$ for which we want to embed $T$ into $U_{k+1}(I)$ can be expressed as $I = [a, b)$
for some integers $a$ and $b$, and we have

$$b - a = |I| = \lfloor c_{k+1}|T| \rfloor.$$

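For every $i = 1, \ldots, d'$, we now define an interval $I_i$ (later, we will map $F_i$ into $U_k(I_i)$). Let us
first define $I_1$. Let $h_1$ be the smallest integer such that $a \le h_1 x_{k+1}$, and let $\bar{h}_1$ be the smallest
integer such that $\lfloor c_k|F_1| \rfloor \le \bar{h}_1 x_{k+1}$. Note that $h_1 \ge 1$. We let

$$I_1 = [h_1 x_{k+1},\ (h_1 + \bar{h}_1) x_{k+1}).$$

Assume now that we have defined the interval

$$I_i = [h_i x_{k+1},\ (h_i + \bar{h}_i) x_{k+1})$$

for $1 \le i < d'$. We define the interval $I_{i+1}$ as follows. Let $h_{i+1} = h_i + \bar{h}_i$ and let $\bar{h}_{i+1}$ be the
smallest integer such that $\lfloor c_k|F_{i+1}| \rfloor \le \bar{h}_{i+1} x_{k+1}$. We let

$$I_{i+1} = [h_{i+1} x_{k+1},\ (h_{i+1} + \bar{h}_{i+1}) x_{k+1}).$$

Observe that for $1 \le i \le d'$, the interval $I_i$ is simply $I_{k+1,h_i,\bar{h}_i}$. Note also that for every
$i = 1, \ldots, d'$, we have

$$\bar{h}_i x_{k+1} < \lfloor c_k|F_i| \rfloor + x_{k+1}.$$

It follows that the size of $I_i$ is at most $\lfloor c_k|F_i| \rfloor + x_{k+1} - 1$. Thus, since $h_1 x_{k+1} < a + x_{k+1}$, we get
that

$$\begin{aligned}
\bigcup_{i=1}^{d'} I_i &\subseteq \bigl[a,\ a + (d' + 1)(x_{k+1} - 1) + \lfloor c_k|T| \rfloor\bigr) \\
&\subseteq \bigl[a,\ a + d \cdot (x_{k+1} - 1) + \lfloor c_k|T| \rfloor\bigr).
\end{aligned}$$

Now, since $d \cdot (x_{k+1} - 1) \le \frac{2^k}{(k+1)^2}$ and $2^k < |T|$, it follows that

$$\begin{aligned}
\bigcup_{i=1}^{d'} I_i &\subseteq \Bigl[a,\ a + \Bigl\lfloor \frac{2^k}{(k+1)^2} + c_k|T| \Bigr\rfloor\Bigr) \\
&\subseteq \Bigl[a,\ a + \Bigl\lfloor \frac{|T|}{(k+1)^2} + c_k|T| \Bigr\rfloor\Bigr) \\
&= [a,\ a + \lfloor c_{k+1}|T| \rfloor) \\
&= I.
\end{aligned}$$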
On the other hand, note that for $1 \le i \le d'$, $I_i$ contains at least $\lfloor c_k|F_i| \rfloor$ integers. Therefore, by
the fact that, for any $i$, $1 \le i \le d'$, each tree in $F_i$ contains at most $2^k$ nodes, we get that there
exists an ancestry mapping of each $F_i$ into $U_k(I_i)$. We therefore get an ancestry mapping from
all $F_i$'s to $U_k(I)$, by Claim 3.2. It is now left to map the nodes in the spine $S$ into $U_{k+1}(I)$, in
a way that will respect the ancestry relation.

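For every $i$, $1 \le i \le d'$, let $\hat{h}_i = \sum_{j=1}^{i} \bar{h}_j$ (here $\bar{h}_j$ denotes the number of blocks of length
$x_{k+1}$ in $I_j$, so that $|I_j| = \bar{h}_j x_{k+1}$). We map the node $v_i$ of the spine to the triplet
$(k+1, h_1, \hat{h}_i)$.

Let us now show that $(k+1, h_1, \hat{h}_i)$ is in $U_{k+1}(I)$. First, the fact that $I_{k+1,h_1,\hat{h}_i} \subseteq I$
follows from the fact that $I_{k+1,h_1,\hat{h}_i} = \bigcup_{j=1}^{i} I_j$, using $\bigcup_{j=1}^{d'} I_j \subseteq I$. It remains to show that
$h_1 \in [0, H_{k+1})$ and that $\hat{h}_i \in [0, J_{k+1})$. Note that

$$a < 3n = \frac{3nd(k+1)^2}{2^k} \cdot \frac{2^k}{d(k+1)^2} \le (H_{k+1} - 1)\, x_{k+1}.$$

Therefore, the smallest integer $h_1$ such that $a \le h_1 x_{k+1}$ must satisfy $h_1 \in [0, H_{k+1})$. Recall now
that for every $i$, $1 \le i \le d'$, $\bar{h}_i$ is the smallest integer such that $\lfloor c_k|F_i| \rfloor \le \bar{h}_i x_{k+1}$. Thus

$$\bar{h}_i \le 1 + \frac{c_k|F_i|}{x_{k+1}}.$$

It follows that

$$\hat{h}_i = \sum_{j=1}^{i} \bar{h}_j \le d + \frac{c_k \sum_{j=1}^{i} |F_j|}{x_{k+1}} \le d + \frac{c_k|T|}{x_{k+1}} \le d + \frac{c_k 2^{k+1}}{x_{k+1}}.$$

Thus

$$\sum_{j=1}^{i} \bar{h}_j \le d + 2 c_k d (k+1)^2 < 2 c_{k+1} d (k+1)^2 = J_{k+1}.$$

Therefore $\hat{h}_i \in [0, J_{k+1})$.

We now show that our mapping is indeed an ancestry mapping. Observe first that, for $i$
and $j$ such that $1 \le i < j \le d'$, we have

$$I_{k+1,h_1,\hat{h}_i} \subseteq I_{k+1,h_1,\hat{h}_j}.$$

Thus, the interval associated with $v_j$ contains the one associated with $v_i$, as desired.

In addition, recall that, for every $i = 1, \ldots, d'$, $F_i$ is mapped into $U_k(I_i)$. Therefore, if $L(v)$
is the triplet of some node $v \in F_i$, then the interval associated with it is contained in $I_i$. Since
$I_i \subseteq I_{k+1,h_1,\hat{h}_j}$ for every $j$ such that $1 \le i \le j \le d'$, we obtain that the interval associated with
$v$ is contained in the interval associated with $v_j$. This completes the proof of Claim 3.3.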

From Claim 3.3 we get that there exists an ancestry mapping of any $F \in \mathcal{F}(n, d)$ into
$U$. We use this ancestry mapping to label the nodes in $F$: an ancestry query between two
labels can be answered using a simple interval containment test between the corresponding
intervals. The stated label size follows, as each node can be encoded using $\log|U|$ bits, and
$|U| = \Gamma_{\log n} = O(nd^2)$. This completes the proof of the theorem. □
4 A compact adjacency labeling scheme and a small universal
graph for $\mathcal{F}(n, d)$



The ancestry labeling scheme described in the previous section can be advantageously transformed
into an adjacency labeling scheme for trees of bounded depth. Recall that an adjacency
labeling scheme for the family of graphs $\mathcal{G}$ is a pair $(\mathcal{M}, \mathcal{D})$ of marker and decoder, satisfying
that if $L(u)$ and $L(v)$ are the labels given by the marker $\mathcal{M}$ to two nodes $u$ and $v$ in some graph
$G \in \mathcal{G}$, then

$$\mathcal{D}(L(u), L(v)) = 1 \iff u \text{ and } v \text{ are adjacent in } G.$$

Similarly to the ancestry case, we evaluate an adjacency labeling scheme $(\mathcal{M}, \mathcal{D})$ by its label
size, namely the maximum number of bits in a label assigned by the marker algorithm $\mathcal{M}$ to
any node in any graph in $\mathcal{G}$.

For any two nodes $u$ and $v$ in a rooted forest $F$, $u$ is a parent of $v$ if and only if $u$ is an
ancestor of $v$ and $depth(u) = depth(v) - 1$. Also, $u$ is a neighbor of $v$ if and only if either $u$ is a
parent of $v$ or $v$ is a parent of $u$. It therefore follows that one can easily transform any ancestry
labeling scheme for $\mathcal{F}(n, d)$ to an adjacency labeling scheme for $\mathcal{F}(n, d)$ with an extra additive
term of $\log d$ bits to the label size (these bits are simply used to encode the depth of a vertex).
Using Theorem 3.1 we thus obtain the following.

Theorem 4.1 There exists an adjacency labeling scheme for $\mathcal{F}(n, d)$ of size $\log n + 3\log d + O(1)$.
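
The reduction from ancestry to adjacency behind Theorem 4.1 can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `adjacency_label` and `adjacent` are hypothetical, and a simple DFS-interval decoder stands in for the ancestry scheme of Theorem 3.1 so that the example stays self-contained.

```python
def is_ancestor(a_u, a_v):
    """Stand-in ancestry decoder on DFS intervals (lo, hi); an assumption of
    this sketch, not the scheme of Theorem 3.1."""
    return a_u != a_v and a_u[0] <= a_v[0] and a_v[1] <= a_u[1]


def adjacency_label(ancestry_label, depth):
    """An adjacency label is an ancestry label plus the depth of the node."""
    return (ancestry_label, depth)


def adjacent(label_u, label_v):
    """u and v are adjacent iff one is an ancestor of the other and their
    depths differ by exactly one."""
    (a_u, d_u), (a_v, d_v) = label_u, label_v
    if d_u + 1 == d_v:
        return is_ancestor(a_u, a_v)
    if d_v + 1 == d_u:
        return is_ancestor(a_v, a_u)
    return False


# Tree: r (depth 1) has children a and b (depth 2); a has child c (depth 3).
labels = {
    "r": adjacency_label((0, 3), 1),
    "a": adjacency_label((1, 2), 2),
    "c": adjacency_label((2, 2), 3),
    "b": adjacency_label((3, 3), 2),
}
```

Since the depth is at most $d$, storing it costs about $\log d$ extra bits, which is the additive term that separates the bound of Theorem 4.1 from that of Theorem 3.1.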

Interestingly enough, this latter adjacency labeling scheme makes it possible to give a short implicit
representation (in the sense of [16]) of all forests with bounded depth. Recall that a graph $G$ is
an induced subgraph of a graph $U$ if there exists a one-to-one (but not necessarily onto) mapping
$\phi$ from $V(G)$ to $V(U)$ such that

$$\forall u, v \in V(G), \quad \{u, v\} \in E(G) \iff \{\phi(u), \phi(v)\} \in E(U).$$

Given a graph family $\mathcal{G}$, a graph $U$ is universal for $\mathcal{G}$ if every graph in $\mathcal{G}$ is an induced subgraph
of $U$. Note that a variant of this notion considers the graph $U$ as universal for $\mathcal{G}$ whenever
every graph in $\mathcal{G}$ is a partial subgraph of $U$, i.e., the existence of an edge between $\phi(u)$ and $\phi(v)$
in $E(U)$ does not necessarily imply the existence of the edge $\{u, v\}$. This variant makes it possible to
analyze universal graphs for infinite graph classes [29]. The notion of universality considered in
this paper is somewhat more restrictive, but it makes it possible to relate the size of a universal graph for
$\mathcal{G}$ with the size of the graphs in $\mathcal{G}$. Moreover, this notion of universality precisely captures the
structure of the graphs in $\mathcal{G}$. In fact, there is a tight relation between this notion and adjacency
labeling schemes:

Lemma 4.2 (S. Kannan, M. Naor, and S. Rudich [16])

A graph family $\mathcal{G}$ has an adjacency labeling scheme with label size $k$ if and only if there exists
a universal graph for $\mathcal{G}$ with $2^k$ nodes.

Combining the lemma above with Theorem 4.1, we get the corollary below.

Corollary 4.3 Let $d$ be a constant integer. There exists a universal graph for $\mathcal{F}(n, d)$ with
$O(n)$ nodes.

Proving or disproving the existence of a universal graph with a linear number of nodes for 
the class of n-node trees is a central open problem in the design of informative labeling schemes. 